Incorrect message "getaddrinfo ENOTFOUND" on 4xx sites #4765

Open
1 task done
dominch opened this issue May 16, 2024 · 5 comments
Labels
area:monitor Everything related to monitors help

Comments

@dominch

dominch commented May 16, 2024

📑 I have found these related issues/pull requests

I found many issues regarding faulty DNS, but nothing like this.

🛡️ Security Policy

Description

I needed to add a monitor for a site that required both DNS and auth headers.
The first status I got was:

Offline | 2024-05-15 11:02:03 | getaddrinfo ENOTFOUND test.xyz

That indicates a DNS problem, so I changed several things to get it to work, but the same error remained.
I logged into the console and checked several things; node was now able to resolve the site perfectly, so I was sure that DNS was no longer the issue.

Then I modified the monitor to accept 4xx codes as OK, so it logged:

Online | 2024-05-15 14:42:10 | 404 - Resource Not Found

At this point I was sure that the requests were OK but missing auth headers. I removed 4xx codes from the accepted list and got this message:

Offline | 2024-05-15 14:42:23 | Request failed with status code 404

While both DNS and the 4xx code are problems, I think the system should report the latest message as it is, so that it describes the real problem.
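The two failure classes in this description can be told apart with a check along the following lines. This is only a sketch and not Uptime Kuma's code: `diagnose` and the `fetch_status` callable are hypothetical names introduced here, and the resolver is injectable so the logic can be exercised without a network.

```python
import socket


def diagnose(host, fetch_status, accepted=range(200, 300), resolve=socket.getaddrinfo):
    """Report whether a check fails at name resolution or at the HTTP layer.

    `fetch_status` is a hypothetical callable returning the HTTP status code
    for `host`; `resolve` defaults to the system resolver.
    """
    try:
        # Name resolution comes first; a failure here is a DNS problem.
        resolve(host, 443)
    except socket.gaierror as exc:
        return f"DNS failure: {exc}"

    # The name resolved, so anything that goes wrong now is an HTTP problem.
    status = fetch_status(host)
    if status in accepted:
        return f"OK: {status}"
    return f"HTTP failure: status {status}"
```

With a resolver that works and a server answering 404, this returns `HTTP failure: status 404` under the default accepted range, and `OK: 404` once 4xx codes are added to `accepted`.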

👟 Reproduction steps

Probably only online/offline changes are reflected in messages, so returning 404 and shortly after that 401 will log only the first one.

👀 Expected behavior

It should display the latest response indicating a problem. Here it should have given me the 4xx code once DNS was working, so that I would not keep debugging that area.

😓 Actual Behavior

Only the first message classified as an error is logged.

🐻 Uptime-Kuma Version

latest from helm chart - 1.23.11

💻 Operating System and Arch

kubernetes

🌐 Browser

all

🖥️ Deployment Environment

  • Runtime: k8s 1.22, node from helm
  • Database: embedded
  • Filesystem used to store the database on: longhorn, so probably ext4
  • number of monitors: 15

📝 Relevant log output

```
Offline | 2024-05-15 11:02:03 | getaddrinfo ENOTFOUND test.xyz
Online | 2024-05-15 14:42:10 | 404 - Resource Not Found
Offline | 2024-05-15 14:42:23 | Request failed with status code 404
```
@dominch dominch added the bug Something isn't working label May 16, 2024
@CommanderStorm CommanderStorm added help area:monitor Everything related to monitors and removed bug Something isn't working labels May 16, 2024
@CommanderStorm
Collaborator

site that required [...] dns [...] headers

What is a DNS header?

getaddrinfo ENOTFOUND test.xyz

  • What is the TTL of the domains you are using?
  • Do you have DNS caching enabled in the settings?

Most commonly, this issue is caused by using a DNS resolver which does not like the level of DNS requests it is getting.
=> your DNS server is dropping SOME requests
=> have you enabled NSCD in the settings to lower the number of DNS requests to one per TTL (instead of one on every request)?
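The idea of resolving once per TTL rather than on every request can be sketched as a small cache. This is an illustration only; it is not how NSCD or Uptime Kuma is implemented, and `TtlDnsCache`, its `resolve` callable, and the fixed `ttl_seconds` are assumptions made for the example.

```python
import time


class TtlDnsCache:
    """Cache resolver answers so each host is looked up at most once per TTL."""

    def __init__(self, resolve, ttl_seconds, clock=time.monotonic):
        self._resolve = resolve      # hypothetical callable: host -> address
        self._ttl = ttl_seconds      # assumed fixed TTL for every entry
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # host -> (expires_at, address)

    def lookup(self, host):
        now = self._clock()
        entry = self._entries.get(host)
        if entry is not None and entry[0] > now:
            # Entry is still fresh: answer without contacting the resolver.
            return entry[1]
        # Missing or expired: ask the resolver and remember the answer.
        address = self._resolve(host)
        self._entries[host] = (now + self._ttl, address)
        return address
```

With a 60 second TTL, a monitor that checks every 20 seconds would hit the upstream resolver on roughly one check in three instead of on every check.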

Filesystem used to store the database on: longhorn, so probably ext4

Longhorn is not ext4.
I am assuming you are using ReadWriteMany as I cannot find any guarantees about ReadWriteOnce based longhorn in their wiki.
Please refer to the warning from our wiki:

Warning

Filesystem support for POSIX file locks is required to avoid SQLite database corruption. Be aware of possible file locking problems such as those commonly encountered with NFS. Please map the /app/data-folder to a local directory or volume.

@dominch
Author

dominch commented May 16, 2024

What is a DNS header?

You stripped too much. To get the right response with a 2xx code, I needed to configure a DNS zone in the CoreDNS service in Kubernetes as well as some auth headers for that site. The first step should allow the IP to be resolved correctly and yield a 4xx response; the second changes that into a 2xx code.

Longhorn is not ext4.
I am assuming you are using ReadWriteMany as I cannot find any guarantees about ReadWriteOnce based longhorn in their wiki.

It's ReadWriteOnce. In the attached pod, stat returns overlayfs for the data directory.

@CommanderStorm
Collaborator

I think you overlooked the section on getaddrinfo ENOTFOUND test.xyz

@dominch
Author

dominch commented May 20, 2024

Is this error for DNS only or also for 404?
I left some links from before the DNS zone was fixed and all of them still report that; I'm sure that it's not a cache or anything. I think the 404 message is clear, but I can't see it until I configure the monitor to accept 404 and then roll back.

@CommanderStorm
Collaborator

The Incorrect message "getaddrinfo ENOTFOUND" [...] is a DNS error, if that is what you are asking about.

About the other issue you are having:
The timeline for this one is super unclear to me.
(What did the server return, what did we log)

What is true is that we will not consider DOWN-DOWN transitions as important events => will not push them to the log.
The reason for this is that otherwise you might end up with a wall of notifications that your device is flip-flopping between two down states.
I assume users don't actually want such spam.
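The effect of skipping DOWN-DOWN transitions can be sketched as follows. This is a simplified model and not the project's actual code: `important_heartbeats` is a hypothetical helper, and the timeline is an assumed reconstruction of the report, including an unlogged 404 heartbeat after DNS was fixed.

```python
def important_heartbeats(heartbeats):
    """Yield only heartbeats whose status differs from the previous one."""
    previous_status = None
    for status, message in heartbeats:
        if status != previous_status:
            # UP->DOWN, DOWN->UP, and the very first heartbeat are kept.
            yield status, message
        # DOWN->DOWN and UP->UP fall through without being logged.
        previous_status = status


# Assumed sequence of checks, reconstructed from the report above.
timeline = [
    ("DOWN", "getaddrinfo ENOTFOUND test.xyz"),
    ("DOWN", "getaddrinfo ENOTFOUND test.xyz"),
    ("DOWN", "Request failed with status code 404"),  # DNS fixed, still DOWN
    ("UP", "404 - Resource Not Found"),               # 4xx accepted as OK
    ("DOWN", "Request failed with status code 404"),  # 4xx no longer accepted
]

logged = list(important_heartbeats(timeline))
```

Under these assumptions `logged` holds three entries that match the three log lines in the report: the first 404 failure is a DOWN-DOWN transition, so the stale ENOTFOUND message stays visible until the status actually changes.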
